Abstract

Inference on an extreme-value copula usually proceeds via its Pickands
dependence function, which is a convex function on the unit simplex satisfying
certain inequality constraints. In the setting of an iid random sample from a
multivariate distribution with known margins and unknown extreme-value copula,
an extension of the Capéraà-Fougères-Genest estimator was introduced by
D. Zhang, M. T. Wells and L. Peng [Journal of Multivariate Analysis 99 (2008)
577-588]. The joint asymptotic distribution of the estimator as a random
function on the simplex was not provided. Moreover, implementation of the
estimator requires the choice of a number of weight functions on the simplex,
the issue of their optimal selection being left unresolved.

A new, simplified representation of the CFG-estimator combined with stan- 
dard empirical process theory provides the means to uncover its asymptotic 
distribution in the space of continuous, real-valued functions on the simplex. 
Moreover, the ordinary least-squares estimator of the intercept in a certain lin- 
ear regression model provides an adaptive version of the CFG-estimator whose 
asymptotic behavior is the same as if the variance-minimizing weight functions 
were used. As illustrated in a simulation study, the gain in efficiency can be 
quite sizeable. 
1. Introduction

Let $X_i = (X_{i1}, \ldots, X_{ip})$, $i \in \{1, \ldots, n\}$, be iid random
vectors from a p-variate, continuous distribution function F with multivariate
extreme-value copula C: for $u \in (0, 1]^p \setminus \{(1, \ldots, 1)\}$,
denoting the margins of F by $F_1, \ldots, F_p$,

$$C(u) = P\bigl(F_1(X_{i1}) \le u_1, \ldots, F_p(X_{ip}) \le u_p\bigr) = \exp\{-|y| \, A(y/|y|)\}, \tag{1.1}$$

where $y_j = -\log u_j$ and $|y| = |y_1| + \cdots + |y_p|$. The function A,
whose domain is $\Delta_p = \{w \in [0, 1]^p : w_1 + \cdots + w_p = 1\}$, is
called the Pickands dependence function of C, after Pickands (1981).

Multivariate extreme-value copulas arise as the limits of copulas of vectors
of component-wise maxima of independent random samples (Deheuvels, 1984;
Galambos, 1987). As a consequence, they coincide with the class of copulas of
multivariate extreme-value or max-stable distributions. Therefore, they provide
models for dependence between extreme values that allow extrapolation beyond
the support of the sample. It is then of interest to estimate the Pickands
dependence function A.

A necessary condition for C in (1.1) to be a copula is that A is convex and
satisfies $\max(w_1, \ldots, w_p) \le A(w) \le 1$ for all $w \in \Delta_p$; in
the bivariate case, this is also sufficient. In general, A should admit an
integral representation in terms of a spectral measure. Some other properties
of Pickands dependence functions are studied in Obretenov (1991) and Falk and
Reiss (2008). The upshot of all this is that the class of Pickands dependence
functions is infinite-dimensional. This warrants the use of nonparametric
methods.

Whereas most papers hitherto concentrated on the bivariate case, a
nonparametric estimator for general multivariate Pickands dependence functions
was introduced in Zhang et al. (2008). This estimator is in fact a multivariate
generalization of the one by Capéraà-Fougères-Genest (Capéraà et al., 1997).
The estimator was shown to be uniformly consistent and pointwise asymptotically
normal. However, the joint asymptotic distribution of the estimator as a random
function on $\Delta_p$ was not provided. Moreover, implementation of the
estimator requires the choice of p weight functions $\lambda_j$ on $\Delta_p$,
the issue of their optimal selection being left unresolved.

Using a simplified representation of the above-mentioned estimator, we are
able to uncover its asymptotic distribution in the space
$\mathcal{C}(\Delta_p)$ of continuous, real-valued functions on $\Delta_p$.
Moreover, we give explicit expressions for the weight functions $\lambda_j$
that minimize the pointwise asymptotic variance of the estimator. These optimal
weight functions depend on the unknown distribution. We show that the
CFG-estimator with estimated variance-minimizing weight functions can be
implemented as the intercept estimator in a certain linear regression model via
ordinary least squares. The OLS-estimator is data-adaptive in the sense that
the asymptotic distribution is the same as if the optimal weight functions were
used. In a simulation study, the gain in efficiency is shown to be quite
sizeable.

As in Zhang et al. (2008), the setting here is that of a random sample from
a distribution whose margins are known and whose copula is an extreme-value
copula. It would be worthwhile to extend this to the case of unknown margins
(Guillotte and Perron, 2008; Genest and Segers, 2009) and the case that the
copula of F is merely in the domain of attraction of an extreme-value copula
(Capéraà and Fougères, 2000; Einmahl and Segers, 2009).

The outline of our paper is as follows. The CFG-estimator is introduced in
the next section, including its simplified representation and asymptotic
distribution. The variance-minimizing weight functions are computed in
Section 3 together with an adaptive estimator based on ordinary least squares
in a linear regression framework. Section 4 reports on a simulation study. The
proofs of the results in Sections 2 and 3 are deferred to Appendices A and B,
respectively.



2. CFG-estimator and variants 

Let $X_i = (X_{i1}, \ldots, X_{ip})$, $i \in \{1, \ldots, n\}$, be iid random
vectors from a p-variate, continuous distribution function F with multivariate
extreme-value copula C and Pickands dependence function A as in (1.1). Let
$F_1, \ldots, F_p$ be the marginal distribution functions of F. Put
$Y_i = (Y_{i1}, \ldots, Y_{ip})$ where

$$Y_{ij} = -\log F_j(X_{ij}) \tag{2.1}$$

for $i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, p\}$. The marginal
distributions of the random variables $Y_{ij}$ are standard exponential. The
random vectors $Y_1, \ldots, Y_n$ are iid with common joint survivor function

$$P(Y_{i1} > y_1, \ldots, Y_{ip} > y_p) = C(e^{-y_1}, \ldots, e^{-y_p}) = \exp\{-|y| \, A(y/|y|)\},$$

for $y \in [0, \infty)^p \setminus \{(0, \ldots, 0)\}$, where
$|y| = |y_1| + \cdots + |y_p|$. Put

$$\xi_i(w) = \bigwedge_{j=1}^p \frac{Y_{ij}}{w_j}, \qquad w \in \Delta_p, \quad i \in \{1, \ldots, n\}, \tag{2.2}$$

with '$\wedge$' denoting minimum and with the obvious convention for division
by zero; in particular, $\xi_i(e_j) = Y_{ij}$ for the p standard unit vectors
$e_1, \ldots, e_p$ in $\mathbb{R}^p$. For $w \in \Delta_p$ and $x > 0$, we have

$$P(\xi_i(w) > x) = P(Y_{i1} > w_1 x, \ldots, Y_{ip} > w_p x) = \exp\{-x \, A(w)\}. \tag{2.3}$$

Hence the random variables $\xi_1(w), \ldots, \xi_n(w)$ constitute an
independent random sample from the exponential distribution with mean
$1/A(w)$. It follows that the distribution of $-\log \xi_i(w)$ is Gumbel with
location parameter $\log A(w)$, whence

$$E[-\log \xi_i(w)] = \log A(w) + \gamma, \tag{2.4}$$

the Euler-Mascheroni constant $\gamma = -\Gamma'(1) = 0.5772\ldots$ being the
mean of the standard Gumbel distribution. This suggests the naive estimator

$$\log \hat{A}_n(w) = -\frac{1}{n} \sum_{i=1}^n \log \xi_i(w) - \gamma, \qquad w \in \Delta_p. \tag{2.5}$$
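As a rough illustration of (2.2) and (2.5), the following Python sketch computes the minima $\xi_i(w)$ and the naive estimator from a sample that is already on the standard exponential scale of (2.1). The function names `xi` and `naive_log_A` and the test sample with independent coordinates (for which $A(w) = 1$) are assumptions made for this example only; they do not come from the paper.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, mean of the standard Gumbel law


def xi(y, w):
    """xi(w) = min_j y_j / w_j as in (2.2); coordinates with w_j = 0 give +infinity and are skipped."""
    return min(yj / wj for yj, wj in zip(y, w) if wj > 0)


def naive_log_A(sample, w):
    """Naive estimator (2.5): minus the sample mean of log xi_i(w), minus gamma."""
    return -sum(math.log(xi(y, w)) for y in sample) / len(sample) - EULER_GAMMA


# Hypothetical data: independent standard exponential coordinates, so that A(w) = 1.
rng = random.Random(0)
sample = [[rng.expovariate(1.0) for _ in range(3)] for _ in range(20000)]

A_hat = math.exp(naive_log_A(sample, (0.2, 0.3, 0.5)))
A_vertex = math.exp(naive_log_A(sample, (1.0, 0.0, 0.0)))  # close to, but generally not exactly, 1
print(A_hat, A_vertex)
```

Both printed values should be close to 1 for this sample, but the value at the vertex is only approximately 1, which is the defect addressed next.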
The naive estimator is itself not a valid Pickands dependence function. For
instance, it does not verify the vertex constraints $A(e_j) = 1$ for all
$j \in \{1, \ldots, p\}$. A simple way to at least remedy this defect is by
putting

$$\log \hat{A}_n^{\mathrm{CFG}}(w) = \log \hat{A}_n(w) - \sum_{j=1}^p \lambda_j(w) \log \hat{A}_n(e_j), \qquad w \in \Delta_p, \tag{2.6}$$

where $\lambda_1, \ldots, \lambda_p : \Delta_p \to \mathbb{R}$ are continuous
functions verifying $\lambda_j(e_k) = \delta_{jk}$ for all
$j, k \in \{1, \ldots, p\}$. Continuity of the functions $\lambda_j$ is assumed
merely to ensure that the resulting estimator is a continuous function of w as
well.
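The endpoint correction in (2.6) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helper names are hypothetical, the default weights $\lambda_j(w) = w_j$ are just one admissible choice satisfying $\lambda_j(e_k) = \delta_{jk}$, and the test data again have independent exponential coordinates.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def naive_log_A(sample, w):
    """Naive estimator (2.5) of log A(w); coordinates with w_j = 0 are skipped in the minimum."""
    total = 0.0
    for y in sample:
        total += math.log(min(yj / wj for yj, wj in zip(y, w) if wj > 0))
    return -total / len(sample) - EULER_GAMMA


def cfg_log_A(sample, w, weights=lambda w: w):
    """CFG-estimator (2.6); the default weights lambda_j(w) = w_j are an illustrative choice."""
    p = len(w)
    lam = weights(w)
    correction = 0.0
    for j in range(p):
        e_j = tuple(1.0 if k == j else 0.0 for k in range(p))
        correction += lam[j] * naive_log_A(sample, e_j)
    return naive_log_A(sample, w) - correction


# Hypothetical data with independent coordinates, so that A(w) = 1.
rng = random.Random(1)
sample = [[rng.expovariate(1.0) for _ in range(3)] for _ in range(5000)]
vertices = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
vertex_values = [math.exp(cfg_log_A(sample, e)) for e in vertices]
interior_value = math.exp(cfg_log_A(sample, (0.2, 0.3, 0.5)))
```

By construction the corrected estimator equals 1 at every vertex, whatever the sample.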

The superscript 'CFG' refers to the bivariate estimator by
Capéraà-Fougères-Genest in Capéraà et al. (1997), generalized to the
multivariate case in Zhang et al. (2008). Actually, the original definition in
Zhang et al. (2008) is

$$\log \hat{A}_n^{\mathrm{ZWP}}(w) = \sum_{j=1}^p \lambda_j(w) \, \frac{1}{n} \sum_{i=1}^n \int_0^{1-w_j} \frac{1\{Z_{ij}(w) \le z\} - z}{z(1-z)} \, dz, \tag{2.7}$$

where, with $Y_{ij}$ as in (2.1),

$$Z_{ij}(w) = \frac{\bigwedge_{k: k \ne j} Y_{ik}/w_k}{Y_{ij}/(1-w_j) + \bigwedge_{k: k \ne j} Y_{ik}/w_k}.$$

Moreover, in (2.7), the weight functions $\lambda_j$ are supposed to be
nonnegative and to satisfy the additional constraint

$$\sum_{j=1}^p \lambda_j(w) = 1, \qquad w \in \Delta_p. \tag{2.8}$$

However, if (2.8) holds, then actually the two estimators coincide, that is,

$$\hat{A}_n^{\mathrm{ZWP}}(w) = \hat{A}_n^{\mathrm{CFG}}(w), \qquad w \in \Delta_p. \tag{2.9}$$

The proof of (2.9) is essentially the same as the one in Segers (2007) for the
bivariate case, the key being that the integrals in (2.7) can be solved:

$$\int_0^{1-w_j} \frac{1\{Z_{ij}(w) \le z\} - z}{z(1-z)} \, dz = \log[1 - \{(1-w_j) \wedge Z_{ij}(w)\}] + \log(1-w_j) - \log\{(1-w_j) \wedge Z_{ij}(w)\} = \log Y_{ij} - \log \xi_i(w).$$

In our representation (2.6), however, there is no reason whatsoever to restrict
the weight functions to satisfy (2.8).

The asymptotics of the naive estimator and the CFG-estimator follow from
standard empirical process theory as presented for instance in van der Vaart
and Wellner (1996) and van der Vaart (1998). Let $\mathcal{C}(\Delta_p)$ denote
the Banach space of continuous functions from $\Delta_p$ into $\mathbb{R}$
equipped with the supremum norm. Convergence in distribution is denoted by the
arrow '$\rightsquigarrow$'.
Proposition 2.1 (Naive estimator). Let $X_i = (X_{i1}, \ldots, X_{ip})$,
$i \in \{1, \ldots, n\}$, be iid random vectors from a p-variate, continuous
distribution function F with multivariate extreme-value copula C and Pickands
dependence function A. The naive estimator $\hat{A}_n$ in (2.5) satisfies

$$\sup_{w \in \Delta_p} |\hat{A}_n(w) - A(w)| \to 0, \qquad n \to \infty, \quad \text{almost surely}, \tag{2.10}$$

and in $\mathcal{C}(\Delta_p)$,

$$\sqrt{n} \, (\hat{A}_n - A) \rightsquigarrow A \zeta, \qquad n \to \infty, \tag{2.11}$$

where $\zeta$ is a centered Gaussian process with covariance function

$$\operatorname{cov}(\zeta(v), \zeta(w)) = \operatorname{cov}(-\log \xi_i(v), -\log \xi_i(w)), \qquad v, w \in \Delta_p, \tag{2.12}$$

with $\xi_i(w)$ as in (2.2).

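Theorem 2.2 (CFG-estimator). If, in addition to the assumptions in
Proposition 2.1, the functions
$\lambda_1, \ldots, \lambda_p : \Delta_p \to \mathbb{R}$ are continuous, then

$$\sup_{w \in \Delta_p} |\hat{A}_n^{\mathrm{CFG}}(w) - A(w)| \to 0, \qquad n \to \infty, \quad \text{almost surely}, \tag{2.13}$$

and in $\mathcal{C}(\Delta_p)$,

$$\sqrt{n} \, (\hat{A}_n^{\mathrm{CFG}} - A) \rightsquigarrow A \eta, \qquad n \to \infty, \tag{2.14}$$

where $\eta$ is a centered Gaussian process defined by

$$\eta(w) = \zeta(w) - \sum_{j=1}^p \lambda_j(w) \, \zeta(e_j), \qquad w \in \Delta_p, \tag{2.15}$$

with $\zeta$ as in Proposition 2.1.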

Remark 2.3 (Covariance function). The covariance function (2.12) can be
expressed in terms of A as follows. An application of the identity
$\log(x) = \int_0^\infty \{1(s \le x) - 1(s \le 1)\} \, s^{-1} \, ds$ for
$x \in (0, \infty)$ yields, by Fubini's theorem,

$$\operatorname{cov}(-\log \xi_i(v), -\log \xi_i(w)) = \int_0^\infty \int_0^\infty \bigl( P(\xi_i(v) \ge s, \, \xi_i(w) \ge t) - P(\xi_i(v) \ge s) \, P(\xi_i(w) \ge t) \bigr) \frac{ds}{s} \frac{dt}{t}$$

$$= \int_0^\infty \int_0^\infty \bigl[ \exp\{-\ell((v_1 s) \vee (w_1 t), \ldots, (v_p s) \vee (w_p t))\} - \exp\{-s A(v)\} \exp\{-t A(w)\} \bigr] \frac{ds}{s} \frac{dt}{t},$$

where $\ell(y) = |y| \, A(y/|y|)$ and $|y| = |y_1| + \cdots + |y_p|$. Replacing
A by any estimator of it results in an estimator of the covariance function.
However, a more practical way to estimate this function is by the sample
covariance of the pairs $(-\log \xi_i(v), -\log \xi_i(w))$; see also (the proof
of) Theorem 3.2.
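The sample-covariance estimate mentioned in Remark 2.3 can be sketched as follows. The function names and the independent test sample are assumptions of this example; the only fact used beyond the text is that the Gumbel distribution with unit scale has variance $\pi^2/6$, so the estimated variance at any single point w should be near that value.

```python
import math
import random


def neg_log_xi(y, w):
    """-log xi(w) with xi(w) = min_j y_j / w_j; coordinates with w_j = 0 are skipped."""
    return -math.log(min(yj / wj for yj, wj in zip(y, w) if wj > 0))


def sample_cov(sample, v, w):
    """Sample covariance of the pairs (-log xi_i(v), -log xi_i(w)), estimating (2.12)."""
    n = len(sample)
    a = [neg_log_xi(y, v) for y in sample]
    b = [neg_log_xi(y, w) for y in sample]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


# Hypothetical data with independent standard exponential coordinates.
rng = random.Random(2)
sample = [[rng.expovariate(1.0) for _ in range(3)] for _ in range(20000)]
w = (0.2, 0.3, 0.5)
var_w = sample_cov(sample, w, w)  # should be near pi^2 / 6
cov_e1_e2 = sample_cov(sample, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))  # near 0 under independence
```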
Remark 2.4 (Shape constraints). A further enhancement to the CFG-estimator
is to replace it by the convex minorant of the function

$$\min[\max\{\hat{A}_n^{\mathrm{CFG}}(w), w_1, \ldots, w_p\}, 1], \qquad w \in \Delta_p,$$

as in Deheuvels (1991) and Jiménez et al. (2001) for the bivariate case.
Although the resulting estimator would be a convex function respecting the
bounds $\max(w_1, \ldots, w_p) \le A(w) \le 1$, in case $p \ge 3$ this would
still not guarantee it to be a genuine Pickands dependence function. Still
other ways to impose (some of) the shape restrictions are spline smoothing
under constraints (Hall and Tajvidi, 2000), orthogonal projection
(Fils-Villetard et al., 2008), or Bayesian nonparametrics (Guillotte and
Perron, 2008).
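The pointwise truncation step in Remark 2.4 is simple enough to sketch directly. The function name `clip_to_bounds` is hypothetical, and the sketch covers only the min/max truncation of a single estimated value; the subsequent convex-minorant step is not attempted here.

```python
def clip_to_bounds(a_hat, w):
    """Truncate a pointwise estimate of A(w) to the bounds max(w_1, ..., w_p) <= A(w) <= 1."""
    return min(max(a_hat, *w), 1.0)


w = (0.5, 0.3, 0.2)
too_large = clip_to_bounds(1.3, w)   # pulled down to the upper bound 1
too_small = clip_to_bounds(0.2, w)   # pulled up to max(w) = 0.5
unchanged = clip_to_bounds(0.8, w)   # already within the bounds
```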



Remark 2.5 (Pickands estimator). A different way to exploit the exponentiality
of the random variables $\xi_i(w)$ in (2.3) would be via the Pickands estimator

$$\frac{1}{\hat{A}_n^{\mathrm{P}}(w)} = \frac{1}{n} \sum_{i=1}^n \xi_i(w),$$

as in Pickands (1981). To impose the vertex constraints $A(e_j) = 1$, the
techniques of Deheuvels (1991) or Hall and Tajvidi (2000) can be used, see
Zhang et al. (2008, p. 578). In the bivariate case however, it is known that
the resulting estimators are outperformed by the CFG-estimator
$\hat{A}_n^{\mathrm{CFG}}$ (Segers, 2007; Genest and Segers, 2009). This is
confirmed in the simulation study in Zhang et al. (2008, Section 3), as well as
by our own simulations in Section 4. For this reason, we restrict attention
here to the family of CFG-estimators.
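For comparison, the uncorrected Pickands estimator of Remark 2.5 amounts to the reciprocal of a sample mean. The sketch below uses a hypothetical function name and hypothetical test data, and it does not include the vertex corrections of Deheuvels or Hall and Tajvidi.

```python
import random


def pickands_A(sample, w):
    """Pickands estimator: reciprocal of the sample mean of xi_i(w) = min_j Y_ij / w_j."""
    total = sum(min(yj / wj for yj, wj in zip(y, w) if wj > 0) for y in sample)
    return len(sample) / total


# Tiny deterministic check: xi values are 2 and 6, their mean is 4, so the estimate is 1/4.
toy_value = pickands_A([[1.0, 1.0], [3.0, 3.0]], (0.5, 0.5))

# Hypothetical data with independent standard exponential coordinates, so that A(w) = 1.
rng = random.Random(3)
sample = [[rng.expovariate(1.0) for _ in range(3)] for _ in range(20000)]
A_pickands = pickands_A(sample, (0.2, 0.3, 0.5))
```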



3. The OLS-estimator 



The question remains which weight functions $\lambda_j$ to choose in the
CFG-estimator (2.6). In Zhang et al. (2008), the choice $\lambda_j(w) = w_j$ is
recommended as a pragmatic one. The option of using variance-minimizing
functions $\lambda_j$ was mentioned but not carried out. By casting the
estimation problem in a linear regression framework, we will obtain an
estimator with the same asymptotic performance as the CFG-estimator with those
optimal weights. In this section, we define the estimator and prove its
consistency and asymptotic normality, both in the functional sense. In the next
section, the gain in efficiency is assessed by means of simulations.



In view of Theorem 2.2, for each $w \in \Delta_p$ we have

$$\sqrt{n} \, (\hat{A}_n^{\mathrm{CFG}}(w) - A(w)) \rightsquigarrow A(w) \, \eta(w), \qquad n \to \infty,$$

where $\eta(w)$ is a zero-mean normal random variable. We will look for those
$\lambda_j(w)$ that minimise the variance of $\eta(w)$. Let $\zeta$ be the
Gaussian process on $\mathcal{C}(\Delta_p)$ in Proposition 2.1. For ease of
notation, put

$$\lambda(w) = (\lambda_1(w), \ldots, \lambda_p(w))^T, \qquad \zeta(e) = (\zeta(e_1), \ldots, \zeta(e_p))^T,$$

the symbol "T" denoting matrix transposition. Then

$$\operatorname{var} \eta(w) = \operatorname{var}(\zeta(w) - \lambda(w)^T \zeta(e)) = \operatorname{var} \zeta(w) - 2 \, \lambda(w)^T E[\zeta(e) \, \zeta(w)] + \lambda(w)^T E[\zeta(e) \, \zeta(e)^T] \, \lambda(w).$$

Note that

$$\Sigma = E[\zeta(e) \, \zeta(e)^T] \tag{3.1}$$

is the covariance matrix of $(-\log \xi_i(e_1), \ldots, -\log \xi_i(e_p))^T$.
Provided this matrix is non-singular, $\operatorname{var} \eta(w)$ attains a
unique global minimum for $\lambda(w)$ equal to

$$\lambda^{\mathrm{opt}}(w) = \Sigma^{-1} E[\zeta(e) \, \zeta(w)]. \tag{3.2}$$

With this choice of the weight functions, the variance of

$$\eta_{\mathrm{opt}}(w) = \zeta(w) - \lambda^{\mathrm{opt}}(w)^T \zeta(e) \tag{3.3}$$

is equal to

$$\operatorname{var} \eta_{\mathrm{opt}}(w) = \operatorname{var} \zeta(w) - E[\zeta(w) \, \zeta(e)^T] \, \Sigma^{-1} E[\zeta(e) \, \zeta(w)]. \tag{3.4}$$

This variance is minimal over all possible choices of weight functions
$\lambda_j$.

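The optimal weight functions $\lambda_j^{\mathrm{opt}}$ in (3.2) depend on the
unknown Pickands dependence function A. Fortunately, replacing these weight
functions by uniformly consistent estimators $\hat{\lambda}_{n,j}$ is just as
good asymptotically. For such estimated weight functions, define the adaptive
CFG-estimator by

$$\log \hat{A}_{n,\mathrm{ad}}^{\mathrm{CFG}}(w) = \log \hat{A}_n(w) - \sum_{j=1}^p \hat{\lambda}_{n,j}(w) \log \hat{A}_n(e_j), \qquad w \in \Delta_p. \tag{3.5}$$

Proposition 3.1 (Adaptive CFG-estimator). Assume that, in addition to the
assumptions in Proposition 2.1, the matrix $\Sigma$ in (3.1) is non-singular
and $\hat{\lambda}_{n,j}$ are random elements in $\mathcal{C}(\Delta_p)$ such
that, for every $j \in \{1, \ldots, p\}$,

$$\sup_{w \in \Delta_p} |\hat{\lambda}_{n,j}(w) - \lambda_j^{\mathrm{opt}}(w)| \to 0, \qquad n \to \infty, \quad \text{almost surely},$$

with $\lambda_j^{\mathrm{opt}}$ as in (3.2). Then the adaptive CFG-estimator in
(3.5) satisfies

$$\sup_{w \in \Delta_p} |\hat{A}_{n,\mathrm{ad}}^{\mathrm{CFG}}(w) - A(w)| \to 0, \qquad n \to \infty, \quad \text{almost surely}, \tag{3.6}$$

and in $\mathcal{C}(\Delta_p)$,

$$\sqrt{n} \, (\hat{A}_{n,\mathrm{ad}}^{\mathrm{CFG}} - A) \rightsquigarrow A \, \eta_{\mathrm{opt}}, \qquad n \to \infty, \tag{3.7}$$

where $\eta_{\mathrm{opt}}$ is the zero-mean Gaussian process defined in (3.3).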
Finally we propose a particularly convenient way to implement the adaptive
CFG-estimator in (3.5). For $w \in \Delta_p$, let
$\hat{\beta}_n(w) = (\hat{\beta}_{n,0}(w), \ldots, \hat{\beta}_{n,p}(w))^T$ be
the minimizer in $(b_0, \ldots, b_p)^T$ of

$$\sum_{i=1}^n \Bigl( (-\log \xi_i(w) - \gamma) - b_0 - \sum_{j=1}^p b_j \, (-\log \xi_i(e_j) - \gamma) \Bigr)^2. \tag{3.8}$$

In words, $\hat{\beta}_n(w)$ is the ordinary least-squares (OLS) estimator of
the vector of regression coefficients in a linear regression of the dependent
variable $-\log \xi_i(w) - \gamma$ upon the explanatory variables
$-\log \xi_i(e_j) - \gamma$, $j \in \{1, \ldots, p\}$. Define the OLS-estimator
of A via the estimated intercept by

$$\log \hat{A}_n^{\mathrm{OLS}}(w) = \hat{\beta}_{n,0}(w), \qquad w \in \Delta_p.$$

Since the residuals

$$\hat{\epsilon}_{n,i}(w) = (-\log \xi_i(w) - \gamma) - \hat{\beta}_{n,0}(w) - \sum_{j=1}^p \hat{\beta}_{n,j}(w) \, (-\log \xi_i(e_j) - \gamma)$$

verify $\sum_{i=1}^n \hat{\epsilon}_{n,i}(w) = 0$, we have

$$\log \hat{A}_n^{\mathrm{OLS}}(w) = \hat{\beta}_{n,0}(w) = \log \hat{A}_n(w) - \sum_{j=1}^p \hat{\beta}_{n,j}(w) \log \hat{A}_n(e_j), \tag{3.9}$$

that is, the OLS-estimator is equal to the adaptive CFG-estimator with
estimated weights $\hat{\lambda}_{n,j}(w) = \hat{\beta}_{n,j}(w)$. The variance
of the (logarithm of the) OLS-estimator can be estimated by the sample variance
of the residuals, properly corrected for the loss in number of degrees of
freedom,

$$\hat{\sigma}_{n,\mathrm{OLS}}^2(w) = \frac{1}{n - p - 1} \sum_{i=1}^n \hat{\epsilon}_{n,i}^2(w), \qquad w \in \Delta_p. \tag{3.10}$$
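The regression in (3.8) to (3.10) can be sketched with the normal equations and a small linear solver, using only the Python standard library. This is an illustrative sketch under stated assumptions rather than the authors' code: the names `solve` and `ols_estimator` are hypothetical, the data are assumed to be on the exponential scale of (2.1), and the test sample has independent coordinates so that $A(w) = 1$.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def solve(M, b):
    """Solve the linear system M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
    return x


def ols_estimator(sample, w):
    """Return (A_OLS(w), estimated weights, residual variance) following (3.8)-(3.10)."""
    n, p = len(sample), len(w)
    # Response -log xi_i(w) - gamma; regressors -log xi_i(e_j) - gamma = -log Y_ij - gamma.
    resp = [-math.log(min(yj / wj for yj, wj in zip(y, w) if wj > 0)) - EULER_GAMMA
            for y in sample]
    rows = [[1.0] + [-math.log(yj) - EULER_GAMMA for yj in y] for y in sample]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p + 1)] for a in range(p + 1)]
    xty = [sum(r[a] * t for r, t in zip(rows, resp)) for a in range(p + 1)]
    beta = solve(xtx, xty)
    resid = [t - sum(bj * xj for bj, xj in zip(beta, r)) for r, t in zip(rows, resp)]
    sigma2 = sum(e * e for e in resid) / (n - p - 1)
    return math.exp(beta[0]), beta[1:], sigma2


# Hypothetical data with independent standard exponential coordinates, so that A(w) = 1.
rng = random.Random(4)
sample = [[rng.expovariate(1.0) for _ in range(3)] for _ in range(20000)]
A_ols, weights, sigma2 = ols_estimator(sample, (0.2, 0.3, 0.5))
A_vertex, weights_vertex, sigma2_vertex = ols_estimator(sample, (1.0, 0.0, 0.0))
```

At a vertex $e_j$ the response coincides with the j-th regressor, so the fitted intercept is zero up to rounding and the estimate equals 1 there, in line with the vertex constraint.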

Theorem 3.2 (OLS-estimator). Assume that, in addition to the assumptions
in Proposition 2.1, the matrix $\Sigma$ in (3.1) is non-singular. Then, with
probability tending to one, the minimizer $\hat{\beta}_n(w)$ of (3.8) is
uniquely defined and for $j \in \{1, \ldots, p\}$,

$$\sup_{w \in \Delta_p} |\hat{\beta}_{n,j}(w) - \lambda_j^{\mathrm{opt}}(w)| \to 0, \qquad n \to \infty, \quad \text{almost surely}. \tag{3.11}$$

As a consequence, the OLS-estimator in (3.9) is uniformly consistent,

$$\sup_{w \in \Delta_p} |\hat{A}_n^{\mathrm{OLS}}(w) - A(w)| \to 0, \qquad n \to \infty, \quad \text{almost surely}, \tag{3.12}$$

and in $\mathcal{C}(\Delta_p)$,

$$\sqrt{n} \, (\hat{A}_n^{\mathrm{OLS}} - A) \rightsquigarrow A \, \eta_{\mathrm{opt}}, \qquad n \to \infty, \tag{3.13}$$

where $\eta_{\mathrm{opt}}$ is the zero-mean Gaussian process defined in (3.3).
In addition, the variance estimator in (3.10) satisfies

$$\sup_{w \in \Delta_p} |\hat{\sigma}_{n,\mathrm{OLS}}^2(w) - \operatorname{var} \eta_{\mathrm{opt}}(w)| \to 0, \qquad n \to \infty, \quad \text{almost surely}.$$

Remark 3.3 (Non-singularity assumption). In the bivariate case, the assumption
that the covariance matrix $\Sigma$ in (3.1) is non-singular is equivalent to
the assumption that the copula C is not the comonotone copula (Segers, 2007).
We conjecture that in the general multivariate case, a necessary and sufficient
condition for $\Sigma$ to be non-singular is that none of the bivariate margins
of C is equal to the comonotone copula.



4. Simulations 



In order to investigate the finite-sample properties of the estimators
discussed in the previous sections, we generated pseudo-random samples from
trivariate extreme-value copulas of logistic type as presented in Tawn (1990):

$$A(w) = (\theta^r w_1^r + \phi^r w_2^r)^{1/r} + (\theta^r w_2^r + \phi^r w_3^r)^{1/r} + (\theta^r w_3^r + \phi^r w_1^r)^{1/r} + \psi \, (w_1^r + w_2^r + w_3^r)^{1/r} + 1 - \theta - \phi - \psi, \qquad w \in \Delta_p, \tag{4.1}$$

for $(r, \theta, \phi, \psi) \in [1, \infty) \times [0, 1]^3$. To facilitate
comparisons, we opted for the same parameter values as chosen in Zhang et al.
(2008): a symmetric case, $(r, \theta, \phi, \psi) = (3, 0, 0, 1)$, and an
asymmetric one, $(r, \theta, \phi, \psi) = (6, 0.6, 0.3, 0)$. For each case
10 000 samples were generated of size $n \in \{50, 100, 200\}$ using the
simulation algorithms in Stephenson (2003) and implemented in the R-package
evd (Stephenson, 2002).
Four estimators were compared: the CFG-estimator $\hat{A}_n^{\mathrm{CFG}}$
with weight functions $\lambda_j(w) = w_j$ (as recommended in Zhang et al.,
2008), the OLS-estimator $\hat{A}_n^{\mathrm{OLS}}$ in (3.9), and the enhanced
versions of the original Pickands estimator due to Deheuvels (1991) and Hall
and Tajvidi (2000) as presented in Zhang et al. (2008). To visualize the
performances of the estimators, we plotted their biases and mean squared errors
along the line $\{w \in \Delta_p : w_1 = w_2\}$; see Figures 1 and 2 for the
symmetric and asymmetric logistic dependence functions respectively.

In accordance with the theory, the OLS-estimator is in virtually all cases
considered more efficient than the CFG-estimator. Moreover, our simulations
confirm the findings in Zhang et al. (2008) that the CFG-estimator is typically
more efficient than the ones of Deheuvels and Hall-Tajvidi. Note that the
finite-sample bias of the OLS-estimator is somewhat larger than for the other
estimators. However, thanks to its minimum-variance property it ends up as an
overall winner in terms of mean squared error.



[Figure 1 plots not reproduced. Surviving panel titles: "Symmetric, n=50" and "Symmetric, n=100"; horizontal axis w from 0.0 to 0.5.]

Figure 1: Biases (left) and mean squared errors (right) of
$\hat{A}_n^{\mathrm{OLS}}(w)$ (solid), $\hat{A}_n^{\mathrm{CFG}}(w)$ (dashed),
$\hat{A}_n^{\mathrm{HT}}(w)$ (dash-dotted) and $\hat{A}_n^{\mathrm{D}}(w)$
(dotted) along the line $w_1 = w_2$ for 10 000 samples of size
$n \in \{50, 100, 200\}$ from the trivariate extreme-value copula C with
symmetric logistic dependence function
$A(w) = (w_1^r + w_2^r + w_3^r)^{1/r}$ at $r = 3$.
[Figure 2 plots not reproduced. Surviving panel titles: "Asymmetric, n=50" and "Asymmetric, n=100"; horizontal axis w from 0.0 to 0.5.]

Figure 2: Biases (left) and mean squared errors (right) of
$\hat{A}_n^{\mathrm{OLS}}(w)$ (solid), $\hat{A}_n^{\mathrm{CFG}}(w)$ (dashed),
$\hat{A}_n^{\mathrm{HT}}(w)$ (dash-dotted) and $\hat{A}_n^{\mathrm{D}}(w)$
(dotted) along the line $w_1 = w_2$ for 10 000 samples of size
$n \in \{50, 100, 200\}$ from the trivariate extreme-value copula C with
asymmetric logistic dependence function A in (4.1) for
$(r, \theta, \phi, \psi) = (6, 0.6, 0.3, 0)$.
Appendix A. Proofs for Section 2

Proof of Proposition 2.1. For $w \in \Delta_p$, define
$f_w : (0, \infty)^p \to \mathbb{R}$ by

$$f_w(y) = -\log \Bigl( \bigwedge_{j=1}^p \frac{y_j}{w_j} \Bigr), \qquad y \in (0, \infty)^p. \tag{A.1}$$

We can write

$$\log \hat{A}_n(w) = \frac{1}{n} \sum_{i=1}^n f_w(Y_i) - \gamma.$$



Consider the function class $\mathcal{F} = \{f_w : w \in \Delta_p\}$. We will
show that $\mathcal{F}$ is P-Donsker and therefore also P-Glivenko-Cantelli,
where P denotes the common probability distribution on $(0, \infty)^p$ of the
random vectors $Y_i$. According to Theorem 2.6.8 in van der Vaart and Wellner
(1996) and the proof thereof, we need to verify that $\mathcal{F}$ is a
pointwise separable Vapnik-Cervonenkis-class (VC-class) that admits an envelope
function with a finite second moment under P. Pointwise separability follows
from the fact that the map $w \mapsto f_w(y)$ is continuous in
$w \in \Delta_p$ for each $y \in (0, \infty)^p$. The VC-property can be
established by repeated applications of Lemmas 2.6.15 and 2.6.18, items (i) and
(viii), in van der Vaart and Wellner (1996). Finally, the readily established
bound



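$$\Bigl| \log \bigwedge_{j=1}^p \frac{y_j}{w_j} \Bigr| \le \max \Bigl\{ \Bigl| \log \bigwedge_{j=1}^p y_j \Bigr|, \; \log(p) + \sum_{j=1}^p |\log y_j| \Bigr\} \tag{A.2}$$

yields an envelope function of $\mathcal{F}$ all of whose moments are finite
under P. Observe that the distribution of $\bigwedge_{j=1}^p Y_{ij}$ is
Exponential with mean equal to
$\{p \, A(1/p, \ldots, 1/p)\}^{-1} \in [1/p, 1]$.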

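From the fact that $\mathcal{F}$ is P-Glivenko-Cantelli it follows that

$$\sup_{w \in \Delta_p} |\log \hat{A}_n(w) - \log A(w)| = \sup_{w \in \Delta_p} \Bigl| \frac{1}{n} \sum_{i=1}^n f_w(Y_i) - E[f_w(Y)] \Bigr| \to 0, \qquad n \to \infty, \quad \text{almost surely}.$$

(Here, we dropped a subscript i for convenience.) Continuity of the map
$\exp : \mathcal{C}(\Delta_p) \to \mathcal{C}(\Delta_p) : f \mapsto \exp(f)$
yields uniform consistency as in (2.10).

Moreover, the P-Donsker property entails

$$\sqrt{n} \, (\log \hat{A}_n - \log A) \rightsquigarrow \zeta, \qquad n \to \infty, \tag{A.3}$$

in the space $\ell^\infty(\Delta_p)$ of bounded functions from $\Delta_p$ into
$\mathbb{R}$ equipped with the topology of uniform convergence, where we
identified $\mathcal{F}$ with $\Delta_p$. The process $\zeta$ is zero-mean
Gaussian with covariance function given in (2.12). The sample paths of the
limit process $\zeta$ are continuous with respect to the standard deviation
(semi-)metric $\rho$ on $\Delta_p$ defined by

$$\rho(v, w) = [\operatorname{var}\{f_v(Y) - f_w(Y)\}]^{1/2}, \qquad v, w \in \Delta_p.$$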
If $\lim_{n \to \infty} v_n = v$ in $\Delta_p$ according to the Euclidean
metric, then by continuity of $f_w(y)$ in w and by uniform integrability, also
$\lim_{n \to \infty} \rho(v_n, v) = 0$. (Uniform integrability is checked by
using the bound in (A.2).) It follows that the trajectories of $\zeta$ are also
continuous with respect to the Euclidean metric on $\Delta_p$, that is, $\zeta$
actually takes its values in $\mathcal{C}(\Delta_p)$. As the trajectories of
the left-hand side in (A.3) are continuous too, the convergence in (A.3) takes
place not only in $\ell^\infty(\Delta_p)$ but also in $\mathcal{C}(\Delta_p)$.

The convergence in (2.11) follows from the Hadamard-differentiability of the
map $\exp : \mathcal{C}(\Delta_p) \to \mathcal{C}(\Delta_p) : f \mapsto \exp f$
and the functional delta-method (van der Vaart and Wellner, 1996, Section 3.9). □



Proof of Theorem 2.2. Uniform consistency of $\hat{A}_n^{\mathrm{CFG}}$ in
(2.13) follows from uniform consistency of $\hat{A}_n$ in (2.10) and the fact
that the functions $\lambda_j$ are continuous, hence bounded.

To show (2.14), define
$L : \mathcal{C}(\Delta_p) \to \mathcal{C}(\Delta_p)$ by

$$Lf(w) = f(w) - \sum_{j=1}^p \lambda_j(w) \, f(e_j)$$

for $f \in \mathcal{C}(\Delta_p)$ and $w \in \Delta_p$. The operator L is
linear and bounded. We have
$\log \hat{A}_n^{\mathrm{CFG}} = L(\log \hat{A}_n)$. Moreover, as
$A(e_j) = 1$ for all $j \in \{1, \ldots, p\}$, also $L(\log A) = \log A$. We
find

$$\sqrt{n} \, (\log \hat{A}_n^{\mathrm{CFG}} - \log A) = L\bigl(\sqrt{n} \, (\log \hat{A}_n - \log A)\bigr) \rightsquigarrow L\zeta = \eta, \qquad n \to \infty.$$

The weak convergence in (2.14) follows from the functional delta-method (van
der Vaart and Wellner, 1996, Section 3.9). The representation $\eta = L\zeta$
coincides with (2.15). □



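Appendix B. Proofs for Section 3

Proof of Proposition 3.1. If the optimal weight functions
$\lambda_j^{\mathrm{opt}}$ were known, we could consider the optimal
CFG-estimator

$$\log \hat{A}_{n,\mathrm{opt}}^{\mathrm{CFG}}(w) = \log \hat{A}_n(w) - \sum_{j=1}^p \lambda_j^{\mathrm{opt}}(w) \log \hat{A}_n(e_j), \qquad w \in \Delta_p.$$

By Theorem 2.2, the optimal CFG-estimator is uniformly consistent (2.13) and
is asymptotically normal in the sense of (2.14) with
$\eta = \eta_{\mathrm{opt}}$. Now

$$|\log \hat{A}_{n,\mathrm{ad}}^{\mathrm{CFG}}(w) - \log \hat{A}_{n,\mathrm{opt}}^{\mathrm{CFG}}(w)| \le \sum_{j=1}^p |\hat{\lambda}_{n,j}(w) - \lambda_j^{\mathrm{opt}}(w)| \, |\log \hat{A}_n(e_j)|.$$

By uniform consistency of $\hat{\lambda}_{n,j}$ and asymptotic normality of
$\sqrt{n} \log \hat{A}_n(e_j)$, we obtain, as $n \to \infty$,

$$\sup_{w \in \Delta_p} |\log \hat{A}_{n,\mathrm{ad}}^{\mathrm{CFG}}(w) - \log \hat{A}_{n,\mathrm{opt}}^{\mathrm{CFG}}(w)| \to 0, \quad \text{almost surely},$$

$$\sqrt{n} \, \sup_{w \in \Delta_p} |\log \hat{A}_{n,\mathrm{ad}}^{\mathrm{CFG}}(w) - \log \hat{A}_{n,\mathrm{opt}}^{\mathrm{CFG}}(w)| = o_p(1).$$

As a consequence, the adaptive CFG-estimator is uniformly consistent (3.6) and
asymptotically normal (3.7). □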


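Proof of Theorem 3.2. In analogy to the linear regression framework, define the
$n \times (p + 1)$ matrix

$$X = \begin{pmatrix} 1 & -\log \xi_1(e_1) - \gamma & \cdots & -\log \xi_1(e_p) - \gamma \\ \vdots & \vdots & & \vdots \\ 1 & -\log \xi_n(e_1) - \gamma & \cdots & -\log \xi_n(e_p) - \gamma \end{pmatrix}$$

and the $n \times 1$ vector

$$Y(w) = (-\log \xi_1(w) - \gamma, \ldots, -\log \xi_n(w) - \gamma)^T, \qquad w \in \Delta_p.$$

(No confusion should arise between this $Y(w)$ and the random vectors $Y_i$ in
(2.1).) Provided the matrix $X^T X$ is non-singular, the OLS-estimator
$\hat{\beta}_n(w)$ is given by

$$\hat{\beta}_n(w) = (X^T X)^{-1} X^T Y(w).$$

Recall the functions $f_w$ in (A.1). For $v, w \in \Delta_p$, define
$g_{v,w} : (0, \infty)^p \to \mathbb{R}$ by

$$g_{v,w}(y) = f_v(y) \, f_w(y), \qquad y \in (0, \infty)^p.$$

By (A.2) and by Example 2.10.23 in van der Vaart and Wellner (1996), the
function class $\{g_{v,w} : v, w \in \Delta_p\}$ is P-Donsker and thus
P-Glivenko-Cantelli, where P is the common distribution on $(0, \infty)^p$ of
the random vectors $Y_i$. It follows that, almost surely as $n \to \infty$,

$$\frac{1}{n} X^T X \to \begin{pmatrix} 1 & 0^T \\ 0 & \Sigma \end{pmatrix}, \tag{B.1}$$

$$\sup_{w \in \Delta_p} \Bigl\| \frac{1}{n} X^T Y(w) - \begin{pmatrix} \log A(w) \\ E[\zeta(e) \, \zeta(w)] \end{pmatrix} \Bigr\| \to 0. \tag{B.2}$$

As $\Sigma$ is non-singular, we have

$$\begin{pmatrix} 1 & 0^T \\ 0 & \Sigma \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0^T \\ 0 & \Sigma^{-1} \end{pmatrix},$$

while $\frac{1}{n} X^T X$ is with probability tending to one a non-singular
matrix too. We find, almost surely and uniformly in $w \in \Delta_p$,

$$\hat{\beta}_n(w) = \Bigl( \frac{1}{n} X^T X \Bigr)^{-1} \frac{1}{n} X^T Y(w) \to \begin{pmatrix} 1 & 0^T \\ 0 & \Sigma^{-1} \end{pmatrix} \begin{pmatrix} \log A(w) \\ E[\zeta(e) \, \zeta(w)] \end{pmatrix} = \begin{pmatrix} \log A(w) \\ \lambda^{\mathrm{opt}}(w) \end{pmatrix}.$$

Equation (3.11) follows. Proposition 3.1 and equation (3.9) then yield
equations (3.12) and (3.13).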
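Finally, for the estimation of the variance, note that it does not matter
asymptotically if we divide by n or by $n - p - 1$. Elementary calculations
yield

$$\frac{1}{n} \sum_{i=1}^n \hat{\epsilon}_{n,i}^2(w) = \frac{1}{n} (Y(w) - X \hat{\beta}_n(w))^T (Y(w) - X \hat{\beta}_n(w)) = \frac{1}{n} Y(w)^T Y(w) - \Bigl( \frac{1}{n} X^T Y(w) \Bigr)^T \Bigl( \frac{1}{n} X^T X \Bigr)^{-1} \frac{1}{n} X^T Y(w).$$

The Glivenko-Cantelli property yields, almost surely and uniformly in
$w \in \Delta_p$,

$$\frac{1}{n} Y(w)^T Y(w) = \frac{1}{n} \sum_{i=1}^n (-\log \xi_i(w) - \gamma)^2 \to E[(-\log \xi_i(w) - \gamma)^2] = \operatorname{var} \zeta(w) + (\log A(w))^2, \qquad n \to \infty.$$

In combination with (B.1) and (B.2), we obtain that
$\hat{\sigma}_{n,\mathrm{OLS}}^2(w)$ in (3.10) converges almost surely and
uniformly in $w \in \Delta_p$ to

$$\operatorname{var} \zeta(w) + (\log A(w))^2 - (\log A(w))^2 - E[\zeta(w) \, \zeta(e)^T] \, \Sigma^{-1} E[\zeta(e) \, \zeta(w)] = \operatorname{var} \zeta(w) - E[\zeta(w) \, \zeta(e)^T] \, \Sigma^{-1} E[\zeta(e) \, \zeta(w)],$$

which by (3.4) is equal to $\operatorname{var} \eta_{\mathrm{opt}}(w)$. □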